诸如说服力之类的复杂对话设置涉及交流态度或行为的变化,因此即使与主题没有直接相关,用户的观点也需要解决。在这项工作中,我们贡献了一个新颖的模块化对话系统框架,该框架将事实信息和社会内容无缝地整合到有说服力的对话中。我们的框架可以推广到任何混合社交和任务内容的对话任务。我们进行了一项研究,将用户对框架的评估与基线端到端生成模型进行了比较。我们发现,与没有明确处理社交内容或事实问题的端到端模型相比,我们的框架在包括能力和友善的各个方面更受欢迎。
translated by 谷歌翻译
随着语言模型的不断增加,它对于保护这些模型免于泄漏私人信息变得至关重要。以前的工作试图通过培训具有不同隐私保证的基于RNN的语言模型来应对这一挑战。但是,将经典的差异隐私应用于语言模型会导致模型性能差,因为基本隐私概念过于困惑,并且为数据中所有令牌提供了不体化的保护。鉴于自然语言中的私人信息很少(例如,电子邮件的大部分可能无法携带个人身份信息),我们提出了一个新的隐私概念,选择性差异隐私,以提供严格的数据,以保证数据的敏感部分改善模型实用程序。为了实现这样一个新的概念,我们为基于RNN的语言模型开发了相应的隐私机制,即选择性DPSGD。除了语言建模外,我们还将方法应用于更具体的应用程序 - dialog系统。语言建模和对话系统建设的实验表明,与基线相比,在各种隐私攻击下,提议的保留隐私机制可以实现更好的公用事业,同时保持安全。数据和代码在https://github.com/wyshi/lm_privacy上发布,以促进未来的研究。
translated by 谷歌翻译
Using chatbots to deliver recommendations is increasingly popular. The design of recommendation chatbots has primarily been taking an information-centric approach by focusing on the recommended content per se. Limited attention is on how social connection and relational strategies, such as self-disclosure from a chatbot, may influence users' perception and acceptance of the recommendation. In this work, we designed, implemented, and evaluated a social chatbot capable of performing three different levels of self-disclosure: factual information (low), cognitive opinions (medium), and emotions (high). In the evaluation, we recruited 372 participants to converse with the chatbot on two topics: movies and COVID-19 experiences. In each topic, the chatbot performed small talks and made recommendations relevant to the topic. Participants were randomly assigned to four experimental conditions where the chatbot used factual, cognitive, emotional, and adaptive strategies to perform self-disclosures. By training a text classifier to identify users' level of self-disclosure in real-time, the adaptive chatbot can dynamically match its self-disclosure to the level of disclosure exhibited by the users. Our results show that users reciprocate with higher-level self-disclosure when a recommendation chatbot consistently displays emotions throughout the conversation. Chatbot's emotional disclosure also led to increased interactional enjoyment and more positive interpersonal perception towards the bot, fostering a stronger human-chatbot relationship and thus leading to increased recommendation effectiveness, including a higher tendency to accept the recommendation. We discuss the understandings obtained and implications to future design.
translated by 谷歌翻译
建立社会智能的代理人涉及许多挑战。其中之一是跟踪代理商的精神状态过渡,并教给代理人像人类一样以其价值为指导的决定。为此,我们建议将心理状态模拟和价值建模纳入对话代理。首先,我们建立了一个混合精神状态解析器,该解析器从对话和事件观察中提取信息,并保持代理人思想的图形表示;同时,基于变压器的价值模型从人类价值数据集Valuenet中学习人类的偏好。经验结果表明,所提出的模型在幻想文本冒险游戏数据集中的对话/动作/情感预测任务上达到了最先进的表现。我们还展示了示例案例以证明:(i)拟议的精神状态解析器如何通过基于位置和物体等环境来帮助代理商的决定,以及(ii)价值模型如何帮助代理商根据其个人个人做出决策优先事项。
translated by 谷歌翻译
The development of social media user stance detection and bot detection methods rely heavily on large-scale and high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, suppressing graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built based on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain and user tweet features as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when introducing multiple relations. By analyzing experiment results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
translated by 谷歌翻译
Image Virtual try-on aims at replacing the cloth on a personal image with a garment image (in-shop clothes), which has attracted increasing attention from the multimedia and computer vision communities. Prior methods successfully preserve the character of clothing images, however, occlusion remains a pernicious effect for realistic virtual try-on. In this work, we first present a comprehensive analysis of the occlusions and categorize them into two aspects: i) Inherent-Occlusion: the ghost of the former cloth still exists in the try-on image; ii) Acquired-Occlusion: the target cloth warps to the unreasonable body part. Based on the in-depth analysis, we find that the occlusions can be simulated by a novel semantically-guided mixup module, which can generate semantic-specific occluded images that work together with the try-on images to facilitate training a de-occlusion try-on (DOC-VTON) framework. Specifically, DOC-VTON first conducts a sharpened semantic parsing on the try-on person. Aided by semantics guidance and pose prior, various complexities of texture are selectively blending with human parts in a copy-and-paste manner. Then, the Generative Module (GM) is utilized to take charge of synthesizing the final try-on image and learning to de-occlusion jointly. In comparison to the state-of-the-art methods, DOC-VTON achieves better perceptual quality by reducing occlusion effects.
translated by 谷歌翻译
Dynamic treatment regimes assign personalized treatments to patients sequentially over time based on their baseline information and time-varying covariates. In mobile health applications, these covariates are typically collected at different frequencies over a long time horizon. In this paper, we propose a deep spectral Q-learning algorithm, which integrates principal component analysis (PCA) with deep Q-learning to handle the mixed frequency data. In theory, we prove that the mean return under the estimated optimal policy converges to that under the optimal one and establish its rate of convergence. The usefulness of our proposal is further illustrated via simulations and an application to a diabetes dataset.
translated by 谷歌翻译
As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, the prevalent data-driven techniques such as large-scale language models suffer from data inadequacy and biased corpus, especially for languages with insufficient resources such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation CORGI-PM, which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. Moreover, we address three challenges for automatic textual gender bias mitigation, which requires the models to detect, classify, and mitigate textual gender bias. We also conduct experiments with state-of-the-art language models to provide baselines. To our best knowledge, CORGI-PM is the first sentence-level Chinese corpus for gender bias probing and mitigation.
translated by 谷歌翻译
Off-policy evaluation (OPE) is a method for estimating the return of a target policy using some pre-collected observational data generated by a potentially different behavior policy. In some cases, there may be unmeasured variables that can confound the action-reward or action-next-state relationships, rendering many existing OPE approaches ineffective. This paper develops an instrumental variable (IV)-based method for consistent OPE in confounded Markov decision processes (MDPs). Similar to single-stage decision making, we show that IV enables us to correctly identify the target policy's value in infinite horizon settings as well. Furthermore, we propose an efficient and robust value estimator and illustrate its effectiveness through extensive simulations and analysis of real data from a world-leading short-video platform.
translated by 谷歌翻译
Off-Policy evaluation (OPE) is concerned with evaluating a new target policy using offline data generated by a potentially different behavior policy. It is critical in a number of sequential decision making problems ranging from healthcare to technology industries. Most of the work in existing literature is focused on evaluating the mean outcome of a given policy, and ignores the variability of the outcome. However, in a variety of applications, criteria other than the mean may be more sensible. For example, when the reward distribution is skewed and asymmetric, quantile-based metrics are often preferred for their robustness. In this paper, we propose a doubly-robust inference procedure for quantile OPE in sequential decision making and study its asymptotic properties. In particular, we propose utilizing state-of-the-art deep conditional generative learning methods to handle parameter-dependent nuisance function estimation. We demonstrate the advantages of this proposed estimator through both simulations and a real-world dataset from a short-video platform. In particular, we find that our proposed estimator outperforms classical OPE estimators for the mean in settings with heavy-tailed reward distributions.
translated by 谷歌翻译